Current Issue: April-June | Volume: 2025 | Issue Number: 2 | Articles: 5
Our analysis explores the benefits of artificial intelligence (AI) in music generation, showcasing progress in electronic music, automatic music generation, musical evolution, contributions to music-related disciplines and to the renewal of Western music, specific case studies, and hardware development and educational applications. The identified methods encompass neural networks, automation and simulation, neuroscience techniques, optimization algorithms, data analysis, Bayesian models, computational algorithms, and music processing and audio analysis. These approaches reflect the complexity and versatility of AI in music creation. The interdisciplinary impact is evident, extending into sound engineering, music therapy, and cognitive neuroscience. Robust frameworks for evaluation include Bayesian models, fractal metrics, and the statistical creator-evaluator. The global reach of this research underscores AI's transformative role in contemporary music, opening avenues for future interdisciplinary exploration and algorithmic enhancements.
It has been found that existing methods for generating multi-track music fail to meet market requirements in terms of melody, rhythm, and harmony, and that most of the generated music does not conform to basic music theory. To address these problems, this paper proposes a multi-track music synthesis model that uses an improved WGAN-GP, guided by music theory rules, to generate works with high musicality. The improved WGAN-GP is obtained by refining the adversarial loss function and introducing a self-attention mechanism; it is then applied to multi-track music synthesis, and the model's performance is evaluated both subjectively and objectively. The multi-track music synthesized by this paper's model scores 8.22, higher than the 8.04 of real human works, and its average scores on the four indexes of rhythm, melody, emotion, and harmony are 8.15, 8.27, 7.61, and 8.22, respectively, higher than those of the MuseGAN, MTMG, and HRNN models on all indexes except emotion. The model's data processing accuracy, error rate, training loss value, and track matching are 94.47%, 0.15%, 0.91, and 0.84, respectively, all better than WGAN-GP and MuseGAN. The gap between the synthesized multi-track music and the music theory rules of real music is very small, fully meeting practical needs. The deep learning model constructed in this paper provides a new path for multi-track music generation.
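The abstract names an improved adversarial loss but does not give it. As a hedged illustration only (not the paper's model), the standard WGAN-GP critic objective, the Wasserstein term plus a gradient penalty, can be sketched with a toy one-dimensional linear critic f(x) = w * x, whose input gradient is simply w; the data, weight, and penalty coefficient below are invented for the example:

```python
# Toy sketch of the standard WGAN-GP critic objective, NOT the paper's model:
#   loss = E[f(fake)] - E[f(real)] + LAMBDA * (||grad_x f|| - 1)^2
# With a linear critic f(x) = w * x, the gradient w.r.t. x is simply w.
LAMBDA = 10.0  # gradient-penalty weight commonly used with WGAN-GP

def critic_loss(w, real, fake):
    """WGAN-GP critic loss for the 1-D linear critic f(x) = w * x."""
    wasserstein = (sum(w * x for x in fake) / len(fake)
                   - sum(w * x for x in real) / len(real))
    penalty = LAMBDA * (abs(w) - 1.0) ** 2  # a linear critic has |grad| = |w| everywhere
    return wasserstein + penalty

real = [1.0, 2.0, 3.0]   # stand-in "real" samples
fake = [0.0, 0.5, 1.0]   # stand-in "generated" samples
loss = critic_loss(1.0, real, fake)  # |w| = 1, so the penalty term vanishes
```

With w = 1 the penalty vanishes and the loss reduces to the Wasserstein estimate; any |w| away from 1 is pulled back by the penalty, which is how WGAN-GP softly enforces the 1-Lipschitz constraint on the critic.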
Purpose: Cochlear implant (CI) recipients who listen with a hearing aid (HA) in the contralateral ear, known as bimodal listeners, demonstrate individual variability in speech recognition in noise. This variability may be due in part to differences in the processing delays of the CI and HA devices. This study investigated the influence of matching the processing delays of CI and HA devices on masked speech recognition for bimodal listeners. Method: Twelve postlingually deafened adult CI recipients completed a task of masked speech recognition in two listening conditions: (a) independent default CI and HA processing delays (mismatched) and (b) with their HA-specific delay applied to the CI processing delay (matched). Speech recognition was evaluated with AzBio sentences presented in a 10-talker masker at a 0 dB SNR. The target was presented from the front loudspeaker at 0° azimuth, and the masker was co-located with the target, presented 90° toward the CI ear, or presented 90° toward the HA ear. Results: There was a significant main effect for target-to-masker configuration, with better performance when the masker was spatially separated from the target. Better masked speech recognition was observed in the matched condition as compared to the mismatched condition. Conclusion: Bimodal listeners may experience better masked speech recognition when the processing delay of the CI is individualized to match the processing delay of the contralateral HA.
VR is an important engine in the informatization of the education industry, offering advantages that traditional technology cannot match. In this paper, we explore a new teaching method that combines audio signal processing technology and VR technology with the teaching of musicology majors, and we apply it in teaching practice. After pre-processing the audio signal and extracting acoustic parameters, an LVQ neural network is used to construct a singing evaluation model based on audio signal processing. On this basis, an innovative music teaching mode is designed in combination with VR technology. Taking the music majors of a university as an example, we apply the VR music teaching model and analyze its effect. The evaluation results of the LVQ model are very close to the expert scores, with an average difference of 0.214, so the model can be used effectively to evaluate student singing in music teaching. The sample students' test scores improved by 0.768 points overall after the experiment, and all dimensions differed significantly at the 1% level, indicating that the VR music teaching mode can effectively promote students' music knowledge, skills, and satisfaction, and that its reasonable use can improve the effect of music teaching.
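The abstract does not detail the LVQ model itself. As a hedged sketch, the classic LVQ1 update rule, on which such evaluation models are typically built, moves the winning prototype toward a sample of the same class and away from a sample of a different class; the two-dimensional features, class labels, and learning rate below are hypothetical stand-ins, not the paper's parameters:

```python
# Minimal LVQ1 sketch (illustrative only; features and labels are hypothetical):
# each class is represented by prototype vectors, and the nearest prototype
# is pulled toward same-class samples and pushed away from other-class ones.

def nearest(protos, x):
    """Index of the prototype closest to feature vector x (squared Euclidean)."""
    dists = [sum((p - xi) ** 2 for p, xi in zip(vec, x)) for vec, _ in protos]
    return dists.index(min(dists))

def lvq1_step(protos, x, label, lr=0.1):
    """One LVQ1 update: move the winning prototype toward or away from x."""
    i = nearest(protos, x)
    vec, cls = protos[i]
    sign = 1.0 if cls == label else -1.0  # attract same class, repel others
    protos[i] = ([v + sign * lr * (xi - v) for v, xi in zip(vec, x)], cls)
    return protos

# Two prototypes for "good" / "poor" singing over (pitch-accuracy, rhythm) features
protos = [([0.8, 0.8], "good"), ([0.2, 0.2], "poor")]
protos = lvq1_step(protos, [1.0, 1.0], "good")  # winner drifts toward the sample
```

At inference time, a sung phrase's feature vector is scored by the class of its nearest prototype, which is how an LVQ network maps acoustic parameters to an evaluation grade.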
Diverse multivariate statistics are powerful tools for musical analysis. A recent study identified relationships among different versions of the composition Sadhukarn from Thailand, Laos, and Cambodia using non-metric multidimensional scaling (NMDS) and cluster analysis. However, the datasets used for NMDS and cluster analysis require musical knowledge and complicated manual conversion of notations. This work aims to (i) evaluate a novel approach to musical analysis of the 26 versions of the composition Sadhukarn from Thailand, Laos, and Cambodia, based on multivariate statistics of the potential note degree of the rhyme structure and pillar tone (Look Tok); (ii) compare the multivariate results obtained by this novel approach with those from the published method using manual conversion; and (iii) investigate the impact of normalization on the results obtained by this new method. The results show that the novel approach established in this study successfully groups the 26 Sadhukarn versions according to their countries of origin, and that the results obtained by the novel approach on the full version are comparable to those obtained by the manual conversion approach. The normalization process, however, causes a loss of the identity and uniqueness of the versions. In conclusion, the novel approach based on the full version can be considered a useful alternative for musical analysis based on multivariate statistics. It can also be applied to other music genres, forms, and styles, as well as to other musical instruments.
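As a hedged sketch of how a notation-free multivariate pipeline of this kind can work (the melodies and degree alphabet below are invented, not the study's data): each version is reduced to a frequency vector of note degrees, and pairwise distances between those vectors form the matrix that NMDS or cluster analysis consumes. Normalizing the vectors to proportions discards the absolute counts, which is one way length-related identity information can be lost:

```python
from collections import Counter
from math import sqrt

# Illustrative sketch (invented data, not the study's): each version of a
# melody becomes a frequency vector over note degrees, and pairwise distances
# between vectors form the input matrix for NMDS / cluster analysis.
DEGREES = ["1", "2", "3", "4", "5", "6", "7"]

def degree_vector(notes, normalize=False):
    """Count occurrences of each note degree in a melody."""
    counts = Counter(notes)
    vec = [counts.get(d, 0) for d in DEGREES]
    if normalize:  # proportions drop absolute counts, erasing length identity
        total = sum(vec) or 1
        vec = [v / total for v in vec]
    return vec

def euclidean(a, b):
    """Euclidean distance between two degree vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

version_a = degree_vector(["1", "1", "5", "6", "5"])  # hypothetical version
version_b = degree_vector(["1", "5", "6", "5", "3"])  # hypothetical version
d = euclidean(version_a, version_b)
```

Computing d for every pair of versions yields a symmetric distance matrix; feeding it to NMDS or hierarchical clustering is then a standard step, which is the role the manual-conversion datasets played in the published method.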